Abstract

This paper describes in detail the first version of Morphonette, a new French morphological
resource, and a new radically lexeme-based method of morphological analysis. This
research is grounded in a paradigmatic conception of derivational morphology where the
morphological structure is a structure of the entire lexicon and not one of the individual
words it contains. The discovery of this structure relies on a measure of morphological
similarity between words, on formal analogy and on the properties of two morphological
paradigms: morphological derivational families and morphological derivational series.



1 Paradigmatic derivational morphology

The starting points of this research are the fundamental ideas of lexeme-based morphology
(Aronoff, 1994): only lexemes are signs (i.e. atomic units); affixes are merely phonological marks;
the construction of the meaning and of the form of a derived word are distinct processes. It is
grounded in a conception of derivational morphology where words do not have a morphological
structure and where this structure is a level of organization of the lexicon. This organization is
based on the semantic, formal and categorical relations that hold between the words memorized
in the lexicon (Bybee, 1995). Among these relations, analogies play a prominent role because
they allow the emergence of the morphological paradigms. An analogy is a quaternary relation
a : b :: c : d that holds between the members of a quadruplet (a, b, c, d) such that a is to b as c is to
d. Morphological derivational analogies hold between the members of two types of paradigms:
morphological derivational families and morphological derivational series. This can be illustrated
with an analogy such as duplication : duplicateur :: unification : unificateur¹, where we can see
that duplication and duplicateur belong to the same derivational family and that the same goes
for unification and unificateur. This conception enables us to redefine the morphological
analysis task, which aims to make explicit the morphological paradigms of the lexicon instead
of decomposing the individual words into morphemes. This organization is illustrated in Figure 1.
The analysis of a given word then consists in identifying its position in the morphological
structure of the lexicon. For instance, the word rectificateur 'rectifier' is not analyzed as in (1) but
as a member of the derivational family which contains rectifiable, rectifier 'rectify', rectifieur
'rectifier', rectification, rectificatif 'corrective', etc. and of the derivational series which contains
certificateur 'certifier', fructificateur 'which bears fruits', modificateur 'modifier', sanctificateur
'sanctifier', etc. These two sets can be seen as the morphological coordinates of rectificateur.

(1) [[rectifi(cat)]_V -eur]

¹ 'duplication', 'duplicator', 'unification', 'unifier'
Figure 1: The morphological network of the French lexicon is made up of derivational families
and derivational series. Families and series are connected by morphological analogies. 



The objective of the present research is twofold: first, we propose a radically lexeme-based 
method of morphological analysis capable of providing the morphological derivational structure 
of the lexicon; second, we have computed this structure for a significant fragment of a large- 
coverage lexicon of French. This resource, Morphonette, will soon be made available to the 
public. 

A morphological network solves several problems posed by the morphematic approach, such
as the treatment of words like concevoir 'conceive', decevoir 'deceive', percevoir 'perceive',
recevoir 'receive' or consister 'consist', desister 'desist', persister 'persist', resister 'resist', where
it is difficult to determine the status of the con-, de-, per-, re-, -cevoir or -sister sequences. The
paradigmatic approach is also capable of bringing words such as furieux 'furious' and curieux
'curious' into the same lexical derivational series despite the fact that furieux has a derivational
base, furie 'fury', while the current lexicon of French contains no word that could serve as a base
for curieux. The dissociation of the construction of meaning and form allows us to easily treat
allomorphy, suppletion and phenomena such as the interfixation that one observes in goutte 'drop'
→ gouttelette 'droplet' or triste 'sad' → tristounet 'gloomy', described by Plénat (2005).

The network illustrated in Figure 1 is actually made up of analogies. For instance,
fructificateur:fructification participates in analogies with modificateur:modification,
rectificateur:rectification, sanctificateur:sanctification. Similarly, fructificateur:rectificateur
forms analogies with fructifier:rectifier, fructification:rectification, fructifiable:rectifiable.
Gathering all these analogies poses a serious problem of complexity. For instance, for a lexicon
of 97 010 entries such as the Trésor de la Langue Française (TLF) word list, the number of
quadruplets to be tested is on the order of 10^19. This number is theoretically 10^20 but it can be
divided by 8 by taking advantage of the permutations described in (2), where $\mathcal{L}$ is a
set of representations of the lexical units.



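\[
\forall (a, b, c, d) \in \mathcal{L}^4,\quad a : b :: c : d \;\Rightarrow\;
\begin{aligned}[t]
& a : c :: b : d \;\wedge\; b : a :: d : c \;\wedge\; b : d :: a : c \;\wedge\; c : a :: d : b \\
& \wedge\; c : d :: a : b \;\wedge\; d : b :: c : a \;\wedge\; d : c :: b : a
\end{aligned}
\tag{2}
\]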



For the construction of the Morphonette network, we have used the phonological representations
of the TLF headwords instead of their written forms, thereby reducing the size of the lexicon to
83 082 entries and the number of quadruplets to be checked to 6 · 10^18.
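The orders of magnitude quoted here can be re-derived with a few lines of arithmetic. The sketch below assumes only that every ordered quadruplet over the lexicon is a candidate and that each class of eight equivalent forms is tested once; the function name is illustrative.

```python
def quadruplets_to_test(lexicon_size: int) -> int:
    """Ordered quadruplets over the lexicon, one test per class of 8 equivalent forms."""
    return lexicon_size ** 4 // 8


written = quadruplets_to_test(97_010)        # TLF written forms
phonological = quadruplets_to_test(83_082)   # TLF phonological representations

print(f"written forms:        {written:.1e}")
print(f"phonological forms:   {phonological:.1e}")
```

Both values fall in the ranges given in the text: about 10^19 for the written forms and about 6 · 10^18 for the phonological representations.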

The solution we adopted for the complexity problem consists in using the measure of
morphological similarity proposed by Hathout (2008). This measure enables us to select for a given
entry w the words that are most likely to form analogies with w, namely the members of the
derivational family and series of w (see Section 2). The second problem we have had to solve is
the actual verification of the analogies. We have used the same algorithm as Hathout (2008).
Inspired by the one of Lepage (1998), this algorithm allows us to check whether a formal analogy
holds between four words without having to cut them into morphemes. Notice that this algorithm
may exceptionally fail to find some analogies. Another algorithm, proposed by Stroppa (2005),
does not suffer from this drawback. However, we did not use it because its complexity is in O(n^4)
while the former has a complexity in O(n^2), and because these exceptional failures are largely
compensated by the number and the redundancy of the collected analogies. The construction of
Morphonette poses a third problem, namely the exclusion of the formal analogies that are not
morphologically valid such as constituable : constant :: restituable : restant². We relied on the
structure of the morphological graph to eliminate them, namely on the fact that series contain
large numbers of words, that they are clusters with highly connected members and that series
are connected to each other by large numbers of edges which form analogies. Notice that the
series of the lexicon also form a cluster.

The remainder of the paper is organized as follows. In Section 2 we present the measure
of formal similarity and the morphological neighborhoods where the analogies are looked for.
Section 3 outlines the verification of the formal analogies. In Section 4 we describe in detail the
bootstrapping algorithm we have used for the construction of this first version of Morphonette.
The resource is presented in Section 5. Section 6 discusses some related work and, finally,
Section 7 offers a short conclusion.

2 Morphological similarity 



We have used the measure of morphological similarity proposed by Hathout (2008) for the
construction of Morphonette. This measure brings closer the words that share large numbers of
very specific formal and semantic features: the more features the words share and the more
specific these features are, the closer they are. The measure is calculated by means of a bipartite
graph where the words are connected to their features. The neighbors of a word w are identified
by spreading an activation initiated at the vertex that represents w. First, the activation
is uniformly spread toward the features of w. Then, in the second step, the activation located
on the features is uniformly spread toward the words that possess these features. The level
of activation obtained by a word x after the propagation is an estimation of the morphological
relatedness between w and x. The spreading is simulated by means of a classical random walk
algorithm, that is, by multiplication by the stochastic adjacency matrix of the bipartite graph.
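The two propagation steps can be sketched as follows. This is a minimal illustration, not the implementation used for Morphonette: it spreads the activation with plain dictionaries rather than a matrix product, and the feature strings and the function name are hypothetical toy data.

```python
from collections import defaultdict


def neighbors_by_activation(word, features_of):
    """Two-step uniform spreading: word -> its features -> words sharing them."""
    # Invert the word -> features map to obtain the other side of the bipartite graph.
    words_with = defaultdict(set)
    for w, feats in features_of.items():
        for f in feats:
            words_with[f].add(w)

    activation = defaultdict(float)
    feats = features_of[word]
    for f in feats:
        share = 1.0 / len(feats)                          # step 1: uniform split over features
        for x in words_with[f]:
            activation[x] += share / len(words_with[f])   # step 2: uniform split over words
    return dict(activation)


# Toy features (assumed for illustration only).
features_of = {
    "rectifier": {"rectifi", "ifier"},
    "rectifiable": {"rectifi"},
    "certifier": {"ifier", "certifi"},
}
activation = neighbors_by_activation("rectifier", features_of)
```

A feature shared by few words passes on a larger share of activation to each of them, which is how specific features weigh more than frequent ones.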

The measure originally proposed by Hathout (2008) uses both formal and semantic properties,
the latter being n-grams of words extracted from the TLF definitions. We did not retain them
here because they are not informative enough. Another difference with Hathout (2008) is the
use of phonetic transcriptions instead of word forms. We have used the LIA_PHON phonetizer
of Béchet (2001) in order to transcribe the word forms into sequences of phonemes in Mbrola
format. Each phoneme is encoded as two characters as shown in the examples in (3).

(3) constant      kkonssttan
    constituable  kkonssttiittuuaabbllee
    restant       rraissttan
    restituable   rraissttiittuuaabbllee

The beginning and the end of the words are marked by ##. The morphological similarity is then
estimated by associating with each word the set of all the sequences of 3 phonemes or more. For
instance, the sequences which describe the word constant are presented in (4).



² 'constitutable', 'constant', 'restitutable', 'remaining'



fructifier fructifiant fructificateur fructification fructifiant fructifere sanctifier rectifier
presanctifier fructivore fructidorien fructidorienne fructidoriser fructidor fructueusement
fructueux fructuosite fructose obstructif constructif instructif desobstructif destructif
instructif autodestructif usufructuaire infructueusement sanctifiant sanctifiable rectifieuse
rectifieur rectifiant rectifiable transsubstantifier substantifier stratifier cimentifier certifier
savantifier refortifier ratifier presentifier pontifier plastifier notifier nettifier mortifier
mythifier mystifier quantifier

Figure 2: The 50 nearest neighbors of fructifier 'bear fruit'. The members of the derivational 
family are in bold face and the ones of the derivational series are in italic. 

(4) ##kkon kkonss onsstt ssttan ttan##
    ##kkonss kkonsstt onssttan ssttan##
    ##kkonsstt kkonssttan onssttan##
    ##kkonssttan##
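The extraction of these formal features can be sketched as below. The sketch assumes that the boundary marker ## counts as one symbol and that every contiguous run of three symbols or more is a feature; the function name is illustrative.

```python
def phoneme_ngrams(transcription: str, min_len: int = 3) -> set[str]:
    """All contiguous runs of at least `min_len` symbols, word boundaries included."""
    # Each phoneme is two characters wide; "##" marks the beginning and the end.
    phonemes = [transcription[i:i + 2] for i in range(0, len(transcription), 2)]
    symbols = ["##"] + phonemes + ["##"]
    n = len(symbols)
    return {
        "".join(symbols[i:j])
        for i in range(n)
        for j in range(i + min_len, n + 1)
    }


features = phoneme_ngrams("kkonssttan")   # transcription of constant, from (3)
```

Two words are then compared through the features they share, for example constant and restant through ssttan## and ttan##.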

Figure 2 presents the nearest neighbors of fructifier 'bear fruit'. If we omit sanctifier 'sanctify',
rectifier and presanctifier 'presanctify', we see that the members of the derivational family of
fructifier all appear at the beginning of the list and that the end gathers the members of its
derivational series.

3 Formal analogy 

The measure of morphological similarity enables us to determine a morphological neighborhood
for each word w. This neighborhood gathers a large part of the members of the derivational family
and series of w. These members are precisely the ones with which w can form morphological
analogies. In this way, we can drastically reduce the search space for analogies, as proposed in
Hathout (2008). For instance, if we limit the search to the first 100 neighbors of each word, the
number of quadruplets to be checked for a lexicon of 83 082 entries drops to 10^10. This number
can be further reduced by using two heuristics based on the properties (5) and (6).

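\[
\forall (a, b, c, d) \in \mathcal{L}^4,\quad a : b :: c : d \;\Rightarrow\; l(a) - l(b) = l(c) - l(d)
\tag{5}
\]

where l(x) is the number of phonemes in x.

\[
\forall (a, b, c, d) \in \mathcal{L}^4,\quad a : b :: c : d \;\Rightarrow\;
\bigl(c(a) = c(b) \wedge c(c) = c(d)\bigr) \vee \bigl(c(a) = c(c) \wedge c(b) = c(d)\bigr)
\tag{6}
\]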



where c(x) is the morphosyntactic tag of x. Morphonette uses the Grace tag set (Rajman et al.,
1997). These heuristics divide the total number of quadruplets to be checked by 50. As a result,
2 · 10^8 quadruplets have been checked and 4.2 · 10^6 formal analogies have been collected. In
order to further improve the quality of these analogies, we have only kept the ones where a formal
analogy also holds for the written forms. This additional condition eliminates phonetic analogies
such as paissant : abaissant :: paye : abeille³. The number of analogies actually used for the
construction of the first version of Morphonette is 3.9 · 10^6. The set of these analogies is closed
under the permutations described in (2). Let $\mathcal{A}$ be this set.
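The two necessary conditions on phoneme counts and on tags are cheap to test before the costlier verification of the analogy itself. A minimal sketch follows; the function name is illustrative and the single-letter tags are hypothetical stand-ins, not Grace tags.

```python
def passes_filters(a, b, c, d, n_phonemes, tag):
    """Necessary conditions for a : b :: c : d; a quadruplet failing either is discarded."""
    same_length_gap = n_phonemes[a] - n_phonemes[b] == n_phonemes[c] - n_phonemes[d]
    tags_pair_up = (
        (tag[a] == tag[b] and tag[c] == tag[d])
        or (tag[a] == tag[c] and tag[b] == tag[d])
    )
    return same_length_gap and tags_pair_up


# Transcriptions from (3); two characters per phoneme.
transcription = {
    "constant": "kkonssttan",
    "constituable": "kkonssttiittuuaabbllee",
    "restant": "rraissttan",
    "restituable": "rraissttiittuuaabbllee",
}
n_phonemes = {w: len(t) // 2 for w, t in transcription.items()}
tag = {w: "A" for w in transcription}   # assumed: all four treated as adjectives

kept = passes_filters("constituable", "constant", "restituable", "restant", n_phonemes, tag)
dropped = passes_filters("constituable", "constant", "restant", "restituable", n_phonemes, tag)
```

Passing the filters does not make a quadruplet an analogy: it only remains a candidate for the signature comparison.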

The analogies in $\mathcal{A}$ have been found by using the same technique as Hathout (2008),
which consists in computing an analogical signature for each of the pairs of words (a, b) and (c, d)



³ 'grazing', 'lowering', 'pay', 'bee'



of a quadruplet (a, b, c, d). The analogical signature of a pair of words (a, b) describes a path in
their edit lattice, that is, a sequence of string edit operations. (a, b, c, d) is an analogy if the two
signatures are identical. This method fails to detect some analogies such as (7)⁴.

(7) do : doable :: read : readable 

These failures being exceptional and the analogies highly redundant, it is always possible to
recover the relations a : b and c : d and then the entire analogy a : b :: c : d. Notice that
the algorithm of Stroppa (2005) is able to identify (7), but it has a complexity in O(n^4). It is
obviously not suited to our needs given the number of quadruplets we have to check.
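The idea of comparing signatures can be illustrated with the sketch below. It is a simplified stand-in, not the signature of Hathout (2008): it takes the alignment computed by Python's `difflib.SequenceMatcher` as one path through the edit lattice, abstracts away the shared material and keeps only what is removed and added. Its failure cases need not coincide with those of the algorithm used for Morphonette.

```python
from difflib import SequenceMatcher


def signature(a: str, b: str) -> tuple:
    """Edit operations turning a into b, with the shared segments abstracted away."""
    sig = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            sig.append("@")                      # placeholder for shared material
        else:
            sig.append((a[i1:i2], b[j1:j2]))     # (removed from a, added in b)
    return tuple(sig)


def is_formal_analogy(a: str, b: str, c: str, d: str) -> bool:
    """a : b :: c : d holds, in this sketch, when both pairs have the same signature."""
    return signature(a, b) == signature(c, d)


valid = is_formal_analogy("duplication", "duplicateur", "unification", "unificateur")
invalid = is_formal_analogy("duplication", "duplicateur", "unification", "unifier")
```

The words are never segmented into morphemes: the comparison only concerns the edit operations that relate the two members of each pair.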

4 Morphological network 

Morphonette has been constructed by using a bootstrapping algorithm. We first selected an
initial seed, $\mathcal{M}_0$, composed of the most reliable morphological relations and then
complemented it iteratively with relations induced by $\mathcal{M}_0$. More specifically, the
3.9 · 10^6 collected analogies were used to define a weighted graph $\mathcal{G} = (V, E, w)$ where
$V$ is a set of vertices, namely the set of the headwords of the TLF,
$E = \{(a, b) \in V \times V \mid \exists\, a : b :: c : d \in \mathcal{A}\}$ is a set of edges and
$w : E \to \mathbb{N}$ is a weight function such that
$\forall e \in E,\ w(e) = |\{a : b :: c : d \in \mathcal{A} \mid (a, b) = e\}|$. $\mathcal{G}$ being
built from formal analogies, the words represented by the vertices are mainly connected to members
of their derivational families on the one hand and to members of their derivational series on the
other. The main objective of the construction of Morphonette is to set apart these two types of
relations and to select a set of relations with almost no error. This selection is needed because
$\mathcal{A}$ contains formal analogies such as destructeur : structural :: descripteur : scriptural⁵
which induce morphologically invalid edges, namely destructeur:structural and
descripteur:scriptural.
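The edge weights can be computed in one pass over the set of analogies. The sketch below first closes a toy seed under the eight equivalent forms of an analogy, so that every pair occurs as the first pair of some quadruplet, and then counts; the function names and the toy data are assumptions made for illustration.

```python
from collections import Counter


def equivalent_forms(a, b, c, d):
    """The eight equivalent ways of writing the analogy a : b :: c : d."""
    return {
        (a, b, c, d), (a, c, b, d), (b, a, d, c), (b, d, a, c),
        (c, a, d, b), (c, d, a, b), (d, b, c, a), (d, c, b, a),
    }


def edge_weights(analogies):
    """w(e) = number of analogies whose first pair (a, b) is the edge e."""
    return Counter((a, b) for a, b, _c, _d in analogies)


seed = [
    ("fructificateur", "fructification", "modificateur", "modification"),
    ("fructificateur", "fructification", "rectificateur", "rectification"),
]
A = set().union(*(equivalent_forms(*q) for q in seed))
w = edge_weights(A)
```

In this toy set the edge fructificateur:fructification has weight 2 because it takes part in two analogies, whereas fructificateur:modificateur has weight 1.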

The relations between members of the same family and members of the same series can be 
partially set apart on the basis of the categorical features of the words: two words that belong 
to the same series have identical morphosyntactic tags. As a result: 

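\[
\forall\, a : b :: c : d \in \mathcal{A},\quad c(a) \neq c(b) \;\Rightarrow\;
\phi(a, b) \wedge \phi(c, d) \wedge \sigma(a, c) \wedge \sigma(b, d)
\tag{8}
\]

\[
\forall\, a : b :: c : d \in \mathcal{A},\quad c(a) \neq c(c) \;\Rightarrow\;
\phi(a, c) \wedge \phi(b, d) \wedge \sigma(a, b) \wedge \sigma(c, d)
\tag{9}
\]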

where $\phi(x, y)$ is true iff x and y belong to the same derivational family and $\sigma(x, y)$ is
true iff x and y belong to the same derivational series. However, this criterion does not allow us
to type the edges of analogies where c(a) = c(b) = c(c) = c(d), such as developpeur : developpement ::
enveloppeur : enveloppement⁶, which holds between four masculine singular nouns. The statements
(8) and (9) can be used to define a type function $\tau$ of the analogies in $\mathcal{A}$:

\[
\tau(a : b :: c : d) =
\begin{cases}
f & \text{if } c(a) \neq c(b) \\
s & \text{if } c(a) \neq c(c) \\
u & \text{otherwise}
\end{cases}
\tag{10}
\]

We can then define the subset of E made up of the edges which connect words which may be in
the same family:

\[
\mathcal{F} = \{(a, b) \in E \mid \exists\, a : b :: c : d \in \mathcal{A},\ \tau(a : b :: c : d) \in \{f, u\}\}
\tag{11}
\]
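The typing of the analogies and the selection of the candidate family edges can be sketched as follows. The labels "f", "s" and "u" stand here for family, series and undetermined; the single-letter tags are hypothetical stand-ins for Grace tags and the function names are illustrative.

```python
def analogy_type(a, b, c, d, tag):
    """Type of a : b :: c : d, read off the morphosyntactic tags of its members."""
    if tag[a] != tag[b]:
        return "f"   # (a, b) and (c, d) are family pairs
    if tag[a] != tag[c]:
        return "s"   # (a, b) and (c, d) are serial pairs
    return "u"       # four identical tags: the tags cannot decide


def candidate_family_edges(analogies, tag):
    """Edges (a, b) occurring in at least one analogy typed "f" or "u"."""
    return {
        (a, b)
        for a, b, c, d in analogies
        if analogy_type(a, b, c, d, tag) in {"f", "u"}
    }


tag = {
    "rectifier": "V", "rectification": "N", "modifier": "V", "modification": "N",
    "developpeur": "N", "developpement": "N", "enveloppeur": "N", "enveloppement": "N",
}
analogies = {
    ("rectifier", "rectification", "modifier", "modification"),
    ("rectifier", "modifier", "rectification", "modification"),
    ("developpeur", "developpement", "enveloppeur", "enveloppement"),
}
candidates = candidate_family_edges(analogies, tag)
```

The edge rectifier:modifier is excluded because its analogy is typed as serial, while developpeur:developpement is kept as a candidate since its type is undetermined.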



⁴ We thank Philippe Langlais who pointed out this problem to us.
⁵ 'destructor', 'structural', 'descriptor', 'scriptural'
⁶ 'developer', 'development', 'enveloper', 'envelopment'



The partial typing of the edges in $\mathcal{G}$ can be refined on the basis of two structural
characteristics of the morphological network. These characteristics allow us to select a subgraph
of $\mathcal{G}$ with the most reliable morphological relations only:

(12) Derivational series are large sets. 

(13) Derivational series are clusters. 

The characteristic (12) allows us to identify reliable family relations. This is because two
words a and b which belong to the same family normally participate in one analogy with each
of the members of the series of a and of b. Series being large sets, the weight w(e) of an edge
(a, b) connecting members of the same family is normally high. In other words, the number
of analogies which contain a given edge can be used to identify the edges which reliably connect
members of the same family. For instance, a threshold of 10 can be used to select a set which
only contains family edges. Let $\mathcal{F}_0 = \{e \in \mathcal{F} \mid w(e) > 10\}$ be this set.
We can then rely on $\mathcal{F}_0$ to identify relations between words which belong to the same
series:

\[
\forall\, a : b :: c : d \in \mathcal{A},\quad (a, b) \in \mathcal{F}_0 \;\Rightarrow\; \sigma(a, c) \wedge \sigma(b, d)
\tag{14}
\]

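$\mathcal{F}_0$ can therefore be used to extract from $\mathcal{G}$ a subgraph $\mathcal{G}_0$
composed of the reliable familial relations in $\mathcal{F}_0$ and of the serial relations they
induce:

\[
\mathcal{S}_0 = \{(a, c) \in E \mid \exists\, (a, b) \in \mathcal{F}_0 \text{ and } \exists\, a : b :: c : d \in \mathcal{A}\}
\tag{15}
\]

\[
\mathcal{G}_0 = \mathcal{F}_0 \cup \mathcal{S}_0
\tag{16}
\]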
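The selection of the reliable family edges and of the serial edges they induce can be sketched as below. The threshold is lowered to 1 so that the toy data, which are assumed for illustration, yield a non-empty result; the function names are illustrative.

```python
from collections import Counter


def reliable_family_edges(candidates, weights, threshold=10):
    """Candidate family edges taking part in more than `threshold` analogies."""
    return {e for e in candidates if weights[e] > threshold}


def induced_serial_edges(analogies, f0):
    """Serial edges (a, c) induced by analogies whose first pair is a reliable family edge."""
    return {(a, c) for a, b, c, d in analogies if (a, b) in f0}


analogies = {
    ("rectifier", "rectification", "modifier", "modification"),
    ("rectifier", "rectification", "fructifier", "fructification"),
}
weights = Counter((a, b) for a, b, c, d in analogies)
candidates = {("rectifier", "rectification")}

f0 = reliable_family_edges(candidates, weights, threshold=1)
s0 = induced_serial_edges(analogies, f0)
g0 = f0 | s0
```

Each reliable family edge thus brings with it the serial relations of its first member, here rectifier with modifier and fructifier.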

The structure we get is actually more complex. This is because one word c can belong to
the series of a word a when a is in a relation with a member b of its family but not belong to
the series of a when a is in a relation with another member b'. For instance, artificiel
'artificial' belongs to the same series as officiel 'official' and troisieme 'third' when it is in a
relation with artificiellement 'artificially' but it is only in the same series as officiel when in a
relation with artificialiser 'artificialize'. In the first case, artificiel:artificiellement forms
analogies with officiel:officiellement 'officially' and troisieme:troisiemement 'thirdly'; in the
second, artificiel:artificialiser only forms an analogy with officiel:officialiser 'officialize' but
none with a pair having troisieme as its first member. In other words, each entry belongs to as
many distinct sub-series as there are members in its family. Thus, the morphological structure of
the lexicon consists in a set of filaments of the form (a, b, series(a, b)) where a is an entry, b a
member of its family and $series(a, b) = \{c \in V \mid \exists\, a : b :: c : d \in \mathcal{A}\}$ the
sub-series of a when we consider its relation with b. Actually, the filaments of an entry a are just
a representation of the set of the analogies which contain a.⁷ Filaments are illustrated in Figure 3.
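Filaments are straightforward to materialize once the family edges are known. The sketch below groups, for each family edge (a, b), the third members c of the analogies a : b :: c : d; the function name is illustrative and the data reproduce the artificiel example.

```python
from collections import defaultdict


def filaments(analogies, family_edges):
    """Map each family edge (a, b) to its sub-series series(a, b)."""
    sub_series = defaultdict(set)
    for a, b, c, d in analogies:
        if (a, b) in family_edges:
            sub_series[(a, b)].add(c)
    return dict(sub_series)


analogies = {
    ("artificiel", "artificiellement", "officiel", "officiellement"),
    ("artificiel", "artificiellement", "troisieme", "troisiemement"),
    ("artificiel", "artificialiser", "officiel", "officialiser"),
}
family_edges = {
    ("artificiel", "artificiellement"),
    ("artificiel", "artificialiser"),
}
result = filaments(analogies, family_edges)
```

The entry artificiel gets two filaments, one per member of its family, and troisieme only appears in the sub-series attached to artificiellement.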

The characteristic (13) is then used to enhance the selection of the most reliable edges in
$\mathcal{G}$, starting from the most central serial relations. Even if almost all the familial
relations in $\mathcal{F}_0$ are correct, we need to eliminate the ones that may yield errors when
the initial seed is extended, and especially the ones that connect distinct families. These
connections primarily concern compounds such as zoophilie 'zoophilia' which belong to the family
of zoologie 'zoology', zoophobie 'zoophobia', etc. and to the one of anthropophilie
'anthropophilia', bibliophilie 'bibliophilia', etc. depending on whether we consider that its
radical is zoo or philie. In this case, we eliminate the relation between zoophilie and
anthropophilie by relying on the fact that zoophilie has predominantly


⁷ Let us notice that filaments could be defined in a dual manner from the derivational series. In this case, a
filament of an entry a is a triplet (a, b, family(a, b)) where b is a member of the series of a and family(a, b) is the
sub-family of a when we consider its relation with b. Both types of filaments being equivalent, we have used the
first one because it yields a more compact description of the graph.



gazouillarde  gazouillage
    cafouillarde grenouillarde vasouillarde
gazouillarde  gazouillement
    braillarde geignarde grognarde
gazouillarde  gazouiller
    citrouillarde douillarde grenouillarde rouillarde souillarde vadrouillarde vasouillarde

Figure 3: Three filaments of the entry gazouillarde 'twittering female' 

words ending in -philie in its series and that these words do not have words starting with zoo- in
their series. Put differently, the words starting with zoo- are not well connected within the
central cluster of the series of zoophilie. We classically measure the clustering coefficient of a
word c within the series of a word a by the ratio of the number of triangles to the number of
triples which contain the edge (a, c) (Watts & Strogatz, 1998). Let
$s_0(a) = \{c \in V \mid (a, c) \in \mathcal{S}_0\}$ be the series of a. Then the number of triples
formed by a and one word $c \in s_0(a)$ is $|s_0(a)| - 1$. The number of triangles that a word
$c \in s_0(a)$ forms with other members of $s_0(a)$ is
$|(s_0(a) \setminus \{c\}) \cap (s_0(c) \setminus \{a\})|$. A threshold of 0.66 has been used for
the construction of Morphonette. It allows us to reduce the series to their most central clusters.
For a series $s_0(a)$, this cluster can be defined as in (17).

\[
s'_0(a) = \left\{ c \in s_0(a) \;\middle|\;
\frac{|(s_0(a) \setminus \{c\}) \cap (s_0(c) \setminus \{a\})|}{|s_0(a)| - 1} > 0.66 \right\}
\tag{17}
\]

This reduction is then used to remove from $\mathcal{F}_0$ the edges (a, b) such that
$series(a, b) \cap s'_0(a) = \emptyset$. The resulting graph is the initial seed $\mathcal{M}_0$.
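The reduction of a series to its central cluster can be sketched as follows. The function name is illustrative and the series are toy data chosen to mimic the zoophilie case: three words in -philie that are mutually connected and one word in zoo- that is not.

```python
def central_cluster(a, series, threshold=0.66):
    """Members c of the series of a whose clustering coefficient exceeds the threshold."""
    s_a = series[a]
    if len(s_a) < 2:
        return set()   # no triple can be formed, so no coefficient is defined
    kept = set()
    for c in s_a:
        # Triangles: other members of the series of a that are also in the series of c.
        triangles = len((s_a - {c}) & (series.get(c, set()) - {a}))
        if triangles / (len(s_a) - 1) > threshold:
            kept.add(c)
    return kept


series = {
    "zoophilie": {"bibliophilie", "anthropophilie", "francophilie", "zoophobie"},
    "bibliophilie": {"zoophilie", "anthropophilie", "francophilie"},
    "anthropophilie": {"zoophilie", "bibliophilie", "francophilie"},
    "francophilie": {"zoophilie", "bibliophilie", "anthropophilie"},
    "zoophobie": {"zoophilie"},
}
cluster = central_cluster("zoophilie", series)
```

Each word in -philie closes two triangles out of three possible triples, which is just above the threshold, whereas zoophobie closes none and is left out of the cluster.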

$\mathcal{M}_0$ is then iteratively extended until a fixed point is reached. At step i, we generate
all the formal analogies induced by the transitive closures of the families of $\mathcal{M}_i$.
These analogies a : b :: c : d consist of two pairs (a, b) and (c, d) such that
$\exists (t_1, t_2) \in T_i \times T_i,\ (a, b) \in t_1 \times t_1$ and $(c, d) \in t_2 \times t_2$,
where $T_i$ is the transitive closure of the families of $\mathcal{M}_i$. We then reduce
the graph induced by these analogies to its intersection with $\mathcal{G}$ and add this extension
to $\mathcal{M}_i$ in order to yield $\mathcal{M}_{i+1}$. We actually impose an additional condition
on the extension: for i > 2, only the filaments with a sub-series of 5 words or more are kept. The
fixed point is reached in 8 iterations. The Morphonette network is then constructed by merging
$\mathcal{M}_8$ with $\mathcal{G}_0$.

5 Morphonette 0.1 

This first version of Morphonette comprises 29 310 entries and 96 107 filaments, and therefore the
same number of familial relations. The number of distinct families has not been computed. The
network contains 1 160 098 serial relations, that is 12 per filament on average. These numbers
can be compared with the ones of $\mathcal{G}$, the graph from which this network has been
extracted. $\mathcal{G}$ comprises 75 832 entries, 816 922 filaments (that is 10 per entry on
average, against only 3 in Morphonette) and 2 343 059 serial relations (that is less than 3 per
filament). Morphonette therefore already covers about 40% of the entries of the lexicon. Figure 3
presents an excerpt of this resource consisting of three filaments of the noun gazouillarde
'twittering female'.

A first estimation of the quality of Morphonette has been performed by manually checking
200 filaments randomly extracted from the network. Only one erroneous relation has been found,
between pension and pensif 'pensive', which puts the precision above 99%, if confirmed by a more
thorough evaluation. Even if pension and pensif are etymologically related, there is nowadays
no semantic relation between them. However, pension and pensif participate in a large number
of formal analogies which wrongly put pension in the extended series of deverbal nouns ending in
-ion. The loss of the semantic relation between pension and pensif can only be detected on the
basis of semantic information, but Morphonette 0.1 has been constructed only from the formal
properties of the TLF headwords.

Morphonette 0.1 also contains some errors due to formal accidents such as the relation between
degrimer 'remove the make-up' and degression 'degression' which belongs, from a formal point
of view, to the series of deprimer:depression⁸, comprimer:compression⁹, etc. Once again, the
use of semantic knowledge should be the best way to detect and eliminate this type of error.
Another line of investigation would be to generalize the notion of analogy to sets of three pairs
or more in order to determine the invariants of the sub-series.

Another difficulty we will have to address is the treatment of homonyms and homographs.
For instance, the four meanings of fraise ('strawberry', 'mesentery', 'ruff', 'drill') induce four
distinct derivational families even if the three latter meanings are etymologically related. In
Morphonette 0.1 these families are conflated. We will use the homonym numbers in the TLF
entries and the semantic information present in the definitions to separate them in future versions
of Morphonette.

6 Related Works 



From a theoretical point of view, this work belongs to a framework related to the Network
Morphology of Bybee (1995), to the Surface-to-Surface Morphology of Burzio (2002), and to
emergentist approaches of Aronoff (1994), Albright (2002) or Goldsmith (2006).

The construction of Morphonette uses a bootstrapping algorithm in order to extend an initial
reliable seed. This technique has often been used in computational morphology, for instance
by Goldsmith (2006) or by Bernhard (2006). However, our method differs from these because
it is fully lexeme-based: it neither makes use of morphemes nor contains any representation
of them. Morphological regularities emerge from a very large set of analogies. Gathering this
set is one of the contributions of the work presented in this paper. It was made possible through
the use of the measure of morphological similarity of Hathout (2008). This measure was
inspired by work on small worlds by Gaume et al. (2002). Our method is also close to the ones of
Yarowsky & Wicentowski (2000) and Baroni et al. (2002) where the words are not decomposed
into morphemes. Both make use of string edit distance to identify formal similarity between
words. Our work is also close to the ones by Stroppa & Yvon (2005), Langlais et al. (2009) and
Lavallée & Langlais (2009) who use formal analogies to analyze words morphologically and to
translate them.

The Morphonette network could also be compared to the morphological families constructed
by Xu & Croft (1998), Gaussier (1999) or Bernhard (2009) among others. It is also very close
to Polymots, a manually-constructed morphological lexicon (Gala et al., 2010). Polymots and
Morphonette are complementary since the former primarily contains short words while the latter
mainly contains long words because of the criteria we have used to select the morphological
relations.

With respect to these related works, the main contribution of Morphonette is twofold: first, the
generation of a collection of more than 4 million formal analogies; second, the exploitation of the
structural properties of the morphological graph in order to set apart the familial and the serial
relations.



⁸ 'depress', 'depression'
⁹ 'compress', 'compression'



7 Conclusion and directions for further research 

We have presented in this paper Morphonette, the first morphological network of French. This
network is constructed without decomposition of the words into morphemes. The method we
have used relies on the structural properties of a graph of morphological relations built from a
collection of almost 4 million formal analogies. Morphonette is made up of filaments which are
composed of an entry, a member of its derivational family and a derivational sub-series of similar
words. It allows us to redefine the morphological analysis task, which does not aim to decompose
words into morphemes but to identify their derivational families and series by means of a
set of filaments.

Morphonette will soon be distributed under a Creative Commons licence. A thorough evaluation
of its relations will also be carried out shortly. A second version of this resource will be
developed by designing a measure of semantic relatedness able to differentiate between homonyms,
to spot the formal accidents and to identify allomorphy and suppletion. This measure will
be based on the relations in Morphonette 0.1, which will be used to select the semantic properties
and relations which are the most informative from a morphological point of view.